194 research outputs found

    Ambient Sound Provides Supervision for Visual Learning

    The sound of crashing waves, the roar of fast-moving cars -- sound conveys important information about the objects in our surroundings. In this work, we show that ambient sounds can be used as a supervisory signal for learning visual models. To demonstrate this, we train a convolutional neural network to predict a statistical summary of the sound associated with a video frame. We show that, through this process, the network learns a representation that conveys information about objects and scenes. We evaluate this representation on several recognition tasks, finding that its performance is comparable to that of other state-of-the-art unsupervised learning methods. Finally, we show through visualizations that the network learns units that are selective to objects that are often associated with characteristic sounds. Comment: ECCV 201

    Visually Indicated Sounds

    Objects make distinctive sounds when they are hit or scratched. These sounds reveal aspects of an object's material properties, as well as the actions that produced them. In this paper, we propose the task of predicting what sound an object makes when struck as a way of studying physical interactions within a visual scene. We present an algorithm that synthesizes sound from silent videos of people hitting and scratching objects with a drumstick. This algorithm uses a recurrent neural network to predict sound features from videos and then produces a waveform from these features with an example-based synthesis procedure. We show that the sounds predicted by our model are realistic enough to fool participants in a "real or fake" psychophysical experiment, and that they convey significant information about material properties and physical interactions.

    Spatial cues alone produce inaccurate sound segregation: The effect of interaural time differences

    To clarify the role of spatial cues in sound segregation, this study explored whether interaural time differences (ITDs) are sufficient to allow listeners to identify a novel sound source from a mixture of sources. Listeners heard mixtures of two synthetic sounds, a target and distractor, each of which possessed naturalistic spectrotemporal correlations but otherwise lacked strong grouping cues, and which contained either the same or different ITDs. When the task was to judge whether a probe sound matched a source in the preceding mixture, performance improved greatly when the same target was presented repeatedly across distinct distractors, consistent with previous results. In contrast, performance improved only slightly with ITD separation of target and distractor, even when spectrotemporal overlap between target and distractor was reduced. However, when subjects localized, rather than identified, the sources in the mixture, sources with different ITDs were reported as two sources at distinct and accurately identified locations. ITDs alone thus enable listeners to perceptually segregate mixtures of sources, but the perceived content of these sources is inaccurate when other segregation cues, such as harmonicity and common onsets and offsets, do not also promote proper source separation.

    Summary statistics in auditory perception

    Sensory signals are transduced at high resolution, but their structure must be stored in a more compact format. Here we provide evidence that the auditory system summarizes the temporal details of sounds using time-averaged statistics. We measured discrimination of 'sound textures' that were characterized by particular statistical properties, as normally result from the superposition of many acoustic features in auditory scenes. When listeners discriminated examples of different textures, performance improved with excerpt duration. In contrast, when listeners discriminated different examples of the same texture, performance declined with duration, a paradoxical result given that the information available for discrimination grows with duration. These results indicate that once these sounds are of moderate length, the brain's representation is limited to time-averaged statistics, which, for different examples of the same texture, converge to the same values with increasing duration. Such statistical representations produce good categorical discrimination, but limit the ability to discern temporal detail. Howard Hughes Medical Institute
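
    The convergence argument in this abstract can be illustrated numerically. The sketch below is not the authors' stimulus model: it assumes a deliberately simple stand-in "texture" (unit-variance Gaussian noise) and a single time-averaged statistic (mean power), and the function names are hypothetical. It measures how far apart that statistic is for two independent excerpts of the same texture as the excerpt duration grows.

```python
import random


def excerpt_power(duration, rng):
    """Time-averaged power of one noise excerpt of `duration` samples."""
    total = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(duration))
    return total / duration


def mean_statistic_gap(duration, trials=200, seed=0):
    """Average absolute difference in mean power between two independent
    excerpts of the same (assumed Gaussian-noise) texture."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(trials):
        first = excerpt_power(duration, rng)
        second = excerpt_power(duration, rng)
        gaps.append(abs(first - second))
    return sum(gaps) / trials


if __name__ == "__main__":
    for duration in (10, 100, 1000):
        print(duration, round(mean_statistic_gap(duration), 4))
```

    Under these assumptions the gap shrinks as the duration increases, so a representation limited to the time-averaged statistic has less and less to distinguish two excerpts of the same texture.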

    TMEM106B is a genetic modifier of frontotemporal lobar degeneration with C9orf72 hexanucleotide repeat expansions

    Hexanucleotide repeat expansions in chromosome 9 open reading frame 72 (C9orf72) have recently been linked to frontotemporal lobar degeneration (FTLD) and amyotrophic lateral sclerosis, and may be the most common genetic cause of both neurodegenerative diseases. Genetic variants at TMEM106B influence risk for the most common neuropathological subtype of FTLD, characterized by inclusions of TAR DNA-binding protein of 43 kDa (FTLD-TDP). Previous reports have shown that TMEM106B is a genetic modifier of FTLD-TDP caused by progranulin (GRN) mutations, with the major (risk) allele of rs1990622 associating with earlier age at onset of disease. Here, we report that rs1990622 genotype affects age at death in a single-site discovery cohort of FTLD patients with C9orf72 expansions (n = 14), with the major allele correlated with later age at death (p = 0.024). We replicate this modifier effect in a 30-site international neuropathological cohort of FTLD-TDP patients with C9orf72 expansions (n = 75), again finding that the major allele associates with later age at death (p = 0.016), as well as later age at onset (p = 0.019). In contrast, TMEM106B genotype does not affect age at onset or death in 241 FTLD-TDP cases negative for GRN mutations or C9orf72 expansions. Thus, TMEM106B is a genetic modifier of FTLD with C9orf72 expansions. Intriguingly, the genotype that confers increased risk for developing FTLD-TDP (major, or T, allele of rs1990622) is associated with later age at onset and death in C9orf72 expansion carriers, providing an example of sign epistasis in human neurodegenerative disease.

    Risk Portfolio Optimization Using the Markowitz MVO Model in Relation to Human Limitations in Predicting the Future from the Perspective of the Al-Qur'an

    Risk portfolio management in modern finance has become increasingly technical, requiring sophisticated mathematical tools in both research and practice. Companies cannot insure themselves completely against risk, since humans cannot predict the future precisely, as written in Al-Quran surah Luqman verse 34, so they have to manage risk to obtain an optimal portfolio. The objective here is to minimize the variance among all portfolios, or alternatively, to maximize expected return among all portfolios that have at least a certain expected return. This study focuses on optimizing a risk portfolio with the so-called Markowitz MVO (Mean-Variance Optimization) model. The theoretical tools used in the analysis are the arithmetic mean, geometric mean, variance, covariance, linear programming, and quadratic programming. Finding a minimum-variance portfolio yields a convex quadratic programming problem, that is, minimizing the objective function $x^{T} Q x$ subject to the constraints $\mu^{T} x \geq r$ and $Ax = b$. The outcome of this research is the optimal risk portfolio for a set of investments, computed using MATLAB R2007b together with its graphical analysis.
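
    A minimum-variance problem of the kind this abstract describes can be sketched in a few lines. This is an illustration under stated assumptions, not the study's MATLAB implementation: the only equality constraint is assumed to be a budget constraint (weights sum to 1), short selling is allowed (no sign constraints on the weights), and the names `solve_linear`, `min_variance`, and `markowitz` are hypothetical. Because the objective is strictly convex and there is a single inequality, the sketch first solves without the return constraint and, only if the target return is missed, re-solves with that constraint treated as an equality.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def min_variance(cov, constraints):
    """Minimize x^T cov x subject to equality constraints.

    `constraints` is a list of (coefficients, value) pairs. The KKT system
    [2*cov  C^T; C  0] [x; lam] = [0; d] is solved directly.
    """
    n, k = len(cov), len(constraints)
    kkt = [[0.0] * (n + k) for _ in range(n + k)]
    rhs = [0.0] * (n + k)
    for i in range(n):
        for j in range(n):
            kkt[i][j] = 2.0 * cov[i][j]
    for c, (coeffs, value) in enumerate(constraints):
        for j in range(n):
            kkt[n + c][j] = coeffs[j]
            kkt[j][n + c] = coeffs[j]
        rhs[n + c] = value
    return solve_linear(kkt, rhs)[:n]


def markowitz(cov, mu, target):
    """Minimum-variance weights with expected return of at least `target`."""
    budget = ([1.0] * len(mu), 1.0)  # assumed constraint: weights sum to 1
    weights = min_variance(cov, [budget])
    if sum(m * w for m, w in zip(mu, weights)) >= target:
        return weights  # return constraint is inactive
    return min_variance(cov, [budget, (list(mu), target)])


if __name__ == "__main__":
    cov = [[0.04, 0.0], [0.0, 0.01]]  # illustrative covariance matrix
    mu = [0.10, 0.05]                 # illustrative expected returns
    print(markowitz(cov, mu, 0.08))
```

    With non-negativity bounds or several inequality constraints, this active-set shortcut no longer applies and a general quadratic programming solver would be needed.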

    Search for supersymmetry in events with one lepton and multiple jets in proton-proton collisions at $\sqrt{s} = 13$ TeV

    Peer reviewed

    Search for anomalous couplings in boosted WW/WZ $\to \ell\nu q\bar{q}$ production in proton-proton collisions at $\sqrt{s} = 8$ TeV

    Peer reviewed